Abstract

We consider the computability of entropy and information in classical Hamiltonian systems. We define the information part and the total information capacity part of entropy in classical Hamiltonian systems using relative information under a computable discrete partition. Using a recursively enumerable nonrecursive set, it is shown that even though the initial probability distribution, the entropy, the Hamiltonian and its partial derivatives are computable under a computable partition, the time evolution of the information capacity under the original partition can grow faster than any recursive function. This implies that even though the probability measure and information are conserved in classical Hamiltonian time evolution, we might not actually be able to compute the information with respect to the original computable partition.
1 Introduction

In statistical mechanics, entropy is one of the most important concepts. Entropy is a measure of uncertainty, and statistical mechanics can be viewed as the best theory we can obtain under the constraint of uncertainty or partial information [1,2]. Even though Hamiltonian time evolution conserves probability and the probability measure, the second law of thermodynamics states that entropy is nondecreasing in time. From the information-theoretic perspective this may imply that information is lost during the computation of Hamiltonian time evolution. With the rapid advance of numerical computation of physical systems, the extent to which we can actually compute the entropy and keep the information during its time evolution has become an important and interesting problem.

Computability means that there is an algorithm that calculates a quantity on a Turing machine to arbitrary precision. Originally the question of algorithms was tied to Hilbert's program to prove or disprove, in a systematic way, all statements derived from axiomatic systems. But in the 1930s Gödel showed that in any axiomatic system strong enough to express the natural numbers there exist unprovable statements [3]. Turing applied this to programs and showed that there exist problems which cannot be solved by algorithms [4]. With the advance of algorithmic information theory, G. Chaitin showed that in a formal system with n bits of axioms it is impossible to prove that a particular binary string has Kolmogorov complexity greater than n + c [5].

Which quantities can be computed has also been studied in analysis and differential equations [6]. For the wave equation and for ordinary differential equations, Pour-El, Richards and Zhong showed that a computable initial condition may evolve into a noncomputable solution at a later time [7,8,9], and how these phenomena relate to actual computation with a Turing machine has been discussed [10]. Undecidability and computability in physical systems have also been studied [11,12]. C. Moore showed that a Hamiltonian system can be mapped onto a Turing machine, so that the question of where its trajectories pass can be mapped onto the halting problem. Z. Xia [13] showed that a gravitational system may have a non-collision singularity, which makes the system noncomputable. The noncomputability of topological entropy in various systems has also been investigated [14,15]. Recently D. Graça et al. [16,17] showed that an ordinary differential equation can be mapped onto a Turing machine.

In this article we apply the computability approach [18,19] to the entropy and information of a probability distribution in classical Hamiltonian systems. We define the entropy of a continuous probability distribution through a discrete computable partition of the phase space. The entropy so defined is divided into two parts, one representing information and the other representing information capacity. We show that even though the initial entropy and the Hamiltonian are computable, the time evolution of the entropy may not be computable.



2 Entropy, information and information capacity 

Let us first define entropy and information for discrete probabilities and probability distributions. We follow Shannon's definition [20]. The Shannon entropy for discrete probabilities $P_1, P_2, \ldots$ is

$$S = -\sum_i P_i \log P_i. \tag{1}$$

If we take the base of the logarithm as 2 then the unit of entropy is the bit. This entropy is a measure of uncertainty, or of the ability to store information. If we have an unknown digit X which can be either 0 or 1 with probability 1/2 each, we have 1 bit of entropy: the system has one bit of uncertainty, or the capacity to store one bit of information. If the unknown digit X is identified as 0, the probability for 0 is 1 and the probability for 1 is 0. From Eq. (1) the entropy becomes 0 ($0 \log 0$ is taken as its limit value 0), and we say that we gain 1 bit of information and the entropy is reduced to 0 bits.
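The one-bit example above can be checked numerically. The following is a minimal sketch; the function name `shannon_entropy_bits` is an illustrative choice, not something defined in the paper.

```python
import math

def shannon_entropy_bits(probs):
    """Shannon entropy -sum P_i log2 P_i in bits; terms with P_i = 0 contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Unknown digit X: 0 or 1 with probability 1/2 each.
unknown_digit = shannon_entropy_bits([0.5, 0.5])
# Digit identified as 0: probability 1 for 0, probability 0 for 1.
known_digit = shannon_entropy_bits([1.0, 0.0])
# Information gained is the drop in entropy.
information_gained = unknown_digit - known_digit
```

Skipping the zero-probability terms implements the convention that $0 \log 0$ takes its limit value 0.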

Let us apply Eq. (1) to a continuous probability distribution inside a phase space $\Omega$ given by the momenta and coordinates $(p, q)$. Consider the probability distribution function $\rho(p, q, t)$ of a particle in the phase space, which satisfies Liouville's equation. Suppose that $\rho(p, q, t)$ has finite support. To define probabilities and actually compute them, the continuous phase space and the probability distribution are discretized. Suppose that we discretize the phase space into a countable number of cells with the same volume $\mu$. The probability distribution is then discretized by

$$P_i = \rho_i \mu = \int_{i\text{th cell}} \rho(p, q, t)\, d\Omega, \tag{2}$$

where the integral on the RHS of Eq. (2) is over the $i$th cell, and $\rho_i$ and $\mu$ satisfy the relation

$$\sum_i \rho_i \mu = 1. \tag{3}$$

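Using Shannon's definition, the entropy of the system $S(\rho, \mu)$ is

$$\begin{aligned}
S(\rho, \mu) &= -\sum_i P_i \log P_i = -\sum_i \rho_i \mu \log(\rho_i \mu) \\
&= -\sum_i \rho_i \mu \log \tilde\rho_i - \sum_i \rho_i \mu \log \tilde\mu = -\sum_i \rho_i \mu \log \tilde\rho_i - \log \tilde\mu,
\end{aligned} \tag{4}$$

where $\tilde\rho_i = \rho_i \alpha$ and $\tilde\mu = \mu/\alpha$. The scaling constant $\alpha$ is inserted to make the argument of the logarithm dimensionless. As the cell size $\mu$ becomes smaller and smaller, the first term on the RHS (right-hand side) of Eq. (4) converges to the phase space integral, $-\sum_i \rho_i \log(\rho_i \alpha)\mu \to -\int d\Omega\, \rho \log(\rho\alpha)$, if the integral exists. The second term diverges as $\log(\alpha/\mu)$. These terms can be interpreted as follows.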

If $\alpha$ is chosen as the volume of the support of the probability distribution, the first part of Eq. (4) is always nonpositive. This term is the negative of the so-called relative entropy or Kullback-Leibler divergence [21]. In general the Kullback-Leibler divergence is defined for two probability distributions $P(x)$ and $Q(x)$ as $D_{KL}(P\|Q) = \int P(x)\log(P(x)/Q(x))\,dx$ and is always nonnegative. For our choice of $\alpha$ the first term in the integral limit is always nonpositive and becomes zero only when the probability is uniformly distributed on its support. Since negative entropy decreases uncertainty, it can be interpreted as the information we gain about the probability distribution $\rho$ with respect to the uniform distribution $1/\alpha$. This term is also referred to as the 'physical entropy' [22] or the 'coarse-grained entropy' over a measure [23]. In the integral limit (often setting $\alpha = 1$), $\int d\Omega\, \rho \log(1/\rho)$ is called the 'differential entropy'. The second term is the total information capacity, which is positive when $\alpha/\mu > 1$. This term depends on how fine the measure is. The information is minimal when $\rho$ is the uniform distribution $1/\alpha$ on the support $\alpha$. In the discrete case the information is maximal, $\log N$, when $\rho$ is a discrete delta ($\tilde\rho = N$ for one cell and zero for all others), and it diverges as the distribution approaches the Dirac delta distribution. This agrees with the fact that in general infinitely many digits are needed to specify a real number. The entropy and information defined in Eq. (4) are subjective. First, they depend on the chosen partition, in this case the discrete grid size $\mu$, which determines the coarse-grained precision. Second, they depend on the scale $\alpha$, which determines how we divide the entropy into the information part and the information capacity part.
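The split in Eq. (4) can be illustrated on a toy case. The sketch below assumes a one-dimensional density $\rho(x) = 2x$ on $[0, 1]$ with $\alpha = 1$ (the volume of its support); this density, the cell count and the name `entropy_split` are illustrative assumptions, not the six-dimensional system of the paper.

```python
import math

def entropy_split(n_cells, alpha=1.0):
    """Split the discretized entropy of rho(x) = 2x on [0, 1] as in Eq. (4).

    Returns (total entropy, information part, capacity part) in bits.
    """
    mu = 1.0 / n_cells  # common cell volume
    # Exact cell probabilities P_i = integral of 2x over the i-th cell.
    probs = [((i + 1) ** 2 - i ** 2) * mu ** 2 for i in range(n_cells)]
    total = -sum(p * math.log2(p) for p in probs)
    # First term of Eq. (4): -sum rho_i mu log(rho_i alpha), with rho_i = P_i / mu.
    information = -sum(p * math.log2((p / mu) * alpha) for p in probs)
    # Second term of Eq. (4): -log(mu / alpha).
    capacity = -math.log2(mu / alpha)
    return total, information, capacity

total, information, capacity = entropy_split(1024)
```

With 1024 cells the capacity part is 10 bits and grows without bound as the grid is refined, while the information part stays nonpositive and approaches $-\int \rho \log_2 \rho \, dx$ for this toy density.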

With this definition of entropy and information we consider the entropy of a probability distribution under Hamiltonian time evolution with respect to the initial discrete partition. Since the system is a classical Hamiltonian system, as time changes the probability density moves like an incompressible fluid in phase space; i.e., if one follows the time evolution of a point in phase space, the density at the representative point remains constant and the measure is conserved. In an actual computation of the time evolution of an initial probability distribution, one first discretizes the probability density and then evolves this discretized probability density in time (in most cases the probability distribution at later times can only be known numerically).

Suppose that at time $t = 0$ we make a discrete partition of phase space. This discrete partition divides the phase space into a countable number of cells. Let us call the initial $i$th discretized probability density and $i$th partitioning cell $\rho_i(0)$ and $C_i(0)$, as in Eq. (2). Then we compute the time-evolved, discretized probability density. In most cases the exact analytic form of $\rho(t)$ is not known, so the discretized probability at time $t$ is obtained by the time evolution of $\rho_i(0)$, the discretized probability density at $t = 0$. As time passes the original cell $C_i(0)$ deforms to $C_i(t)$, but the discretized probability density inside the deformed cell is still $\rho_i(0)$. The deformed cell $C_i(t)$ will be spread over the original discretized partition. If our computing precision is fixed, then the new discretized probability density is computed on the same discretized cells as at $t = 0$. Let us write the new discretized probability density within the initial $i$th cell as $\rho_i(t)$ (see Fig. 1). We have



$$P_i(t) = \rho_i(t)\mu = \sum_j a_{ij}(t)\rho_j(0)\mu, \tag{5}$$

where $a_{ij}(t)$ is given by

$$a_{ij}(t) = \frac{\text{volume of } C_i(0) \cap C_j(t)}{\text{volume of } C_i(0)}. \tag{6}$$

This averaging of the probability distribution (coarse graining) is where information is lost, owing to the finite information capacity.

From Eq. (6) we have the relation

$$0 \le a_{ij}(t) \le 1, \qquad \sum_i a_{ij}(t) = \sum_j a_{ij}(t) = 1. \tag{7}$$



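After one time step, the new entropy $S(t)$ computed with $P_i(t)$ at the fixed precision of the cells $C_i(0)$ is

$$\begin{aligned}
S(t) &= -\sum_i P_i(t)\log P_i(t) = -\sum_i \sum_j a_{ij}(t)\rho_j(0)\mu \log\Big(\sum_{j'} a_{ij'}(t)\rho_{j'}(0)\mu\Big) \\
&= -\sum_i \sum_j a_{ij}(t)\rho_j(0)\mu \log\Big(\sum_{j'} a_{ij'}(t)\tilde\rho_{j'}(0)\Big) - \log\tilde\mu,
\end{aligned} \tag{8}$$

where Eqs. (3) and (7) are used and, as in Eq. (4), $\tilde\rho_j = \rho_j\alpha$ and $\tilde\mu = \mu/\alpha$. Since the function $f(x) = x\log(Ax)$ with $A > 0$ is convex, and a convex function satisfies Jensen's inequality

$$f\Big(\sum_i a_i x_i\Big) \le \sum_i a_i f(x_i) \quad \text{for all } a_i \ge 0 \text{ with } \sum_i a_i = 1, \tag{9}$$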
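we have

$$\begin{aligned}
-\sum_i \sum_j a_{ij}(t)\rho_j(0)\mu \log\Big(\sum_{j'} a_{ij'}(t)\tilde\rho_{j'}(0)\Big) &\ge -\sum_i \sum_j a_{ij}(t)\rho_j(0)\mu \log\tilde\rho_j(0) \\
&= -\sum_j \rho_j(0)\mu \log\tilde\rho_j(0).
\end{aligned} \tag{10}$$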



We see that the information capacity part of $S(t)$ in Eq. (8) is the same as at $t = 0$, but the first term of $S(t)$ (the negative of the information) is always greater than or equal to the corresponding term of $S(0)$. This means that under coarse graining the information either stays the same or is lost. One way to avoid the information loss is to use a finer partition. If the entropy is calculated for the fractional probabilities $P_{ij}(t) = a_{ij}(t)\rho_j(0)\mu$, we have



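$$\begin{aligned}
S_{finer}(t) &= -\sum_j \sum_i P_{ij}(t)\log P_{ij}(t) = -\sum_j \sum_i a_{ij}(t)\rho_j(0)\mu \log\big(a_{ij}(t)\rho_j(0)\mu\big) \\
&= -\sum_i \sum_j a_{ji}(t)\rho_i(0)\mu \log\big(a_{ji}(t)\rho_i(0)\mu\big) \\
&= -\sum_i \rho_i(0)\mu \log\tilde\rho_i(0) - \log\tilde\mu + \sum_i \rho_i(0)\mu\Big(-\sum_j a_{ji}(t)\log a_{ji}(t)\Big).
\end{aligned} \tag{11}$$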



The first term on the RHS of Eq. (11) is the information, which is the same as before. But the total information capacity is increased by

$$\sum_i \rho_i(0)\mu\Big(-\sum_j a_{ji}(t)\log a_{ji}(t)\Big), \tag{12}$$

where $-\sum_j a_{ji}(t)\log a_{ji}(t)$, which is always nonnegative since $0 \le a_{ji}(t) \le 1$, represents the entropy increase (information capacity increase) due to the finer partition or resolution. For example, in the baker transformation this term is $\log 2 = 1$ bit for each discrete time step.
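The difference between coarse graining and refining the partition can be seen on a small example. The sketch below assumes four equal cells and a baker-like overlap matrix in which the image of every cell covers half of two original cells; the matrix, the initial probabilities and all names are illustrative assumptions. Here `a[i][j]` stores the fraction of original cell `i` covered by the image of cell `j`, as in Eq. (6).

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits, skipping zero-probability terms."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

n = 4
p0 = [0.5, 0.25, 0.125, 0.125]  # initial cell probabilities rho_j(0) * mu

# Baker-like overlap matrix: the image of cell j covers half of two cells.
a = [[0.0] * n for _ in range(n)]
for j in range(n):
    a[(2 * j) % n][j] = 0.5
    a[(2 * j + 1) % n][j] = 0.5

# Coarse graining on the original cells, as in Eq. (5).
coarse = [sum(a[i][j] * p0[j] for j in range(n)) for i in range(n)]
# Fractional probabilities P_ij on the finer partition, as used for Eq. (11).
finer = [a[i][j] * p0[j] for i in range(n) for j in range(n)]

s0 = entropy_bits(p0)
s_coarse = entropy_bits(coarse)
s_finer = entropy_bits(finer)
# Capacity increase of Eq. (12): probability-weighted entropy of the overlaps.
capacity_increase = sum(
    p0[j] * entropy_bits([a[i][j] for i in range(n)]) for j in range(n)
)
```

In this toy case the coarse-grained entropy rises above its initial value, which is the information loss of Eq. (10), while the finer partition keeps the information and pays exactly one extra bit of capacity per step.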

Now we ask the question of computation of entropy and its time evolution. 
As stated before, the total entropy is a subjective quantity which depends 
on the partition we choose. Given the computable partition and computable 
initial conditions, can we compute the information and the total information 
capacity needed to keep the information during time evolution? To answer 
this question, we first need definitions about computability. 



3 Computability preliminaries 

The following definitions, theorems and examples are taken from the book of M. B. Pour-El and J. I. Richards [18]. Here the term recursive function means a function that can be implemented and calculated by a Turing machine. $\mathbb{N}$ denotes the set of non-negative integers.

Definition 1 A sequence $\{r_k\}$ of rational numbers is computable if there exist three recursive functions $a, b, s$ from $\mathbb{N}$ to $\mathbb{N}$ such that $b(k) \neq 0$ for all $k$ and

$$r_k = (-1)^{s(k)} \frac{a(k)}{b(k)} \quad \text{for all } k.$$

Definition 2 A sequence $\{r_k\}$ of rational numbers converges effectively to a real number $x$ if there exists a recursive function $e : \mathbb{N} \to \mathbb{N}$ such that for all $N$:

$$k \ge e(N) \text{ implies } |r_k - x| \le 2^{-N}.$$



Definition 3 A real number $x$ is computable if there exists a computable sequence $\{r_k\}$ of rationals which converges effectively to $x$.
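As an illustration of Definitions 1 to 3, the sketch below treats Euler's number as a computable real. The choice of this number, the partial sums `r(k)` and the modulus `modulus(N) = N + 1` are assumptions made for this example; the modulus follows from the standard tail bound $\sum_{j > k} 1/j! \le 2/(k+1)! \le 2^{1-k}$.

```python
from fractions import Fraction
import math

def r(k):
    """Rational partial sum r_k = sum_{j=0}^{k} 1/j!, computed exactly."""
    total, term = Fraction(0), Fraction(1)
    for j in range(k + 1):
        total += term
        term /= (j + 1)
    return total

def modulus(N):
    """Recursive modulus of convergence: k >= N + 1 gives |r_k - e| <= 2**-N."""
    return N + 1

# Rational approximation of e guaranteed to be within 2**-20.
approx = r(modulus(20))
```

Each `r(k)` is an exact fraction, so its numerator and denominator play the roles of $a(k)$ and $b(k)$ in Definition 1 (with $s(k) = 0$), and `modulus` plays the role of the function $e$ in Definition 2.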

We now define computable functions. For simplicity we first consider the case where the function $f$ is defined on a closed bounded rectangle $I^q$ in $\mathbb{R}^q$. Specifically, $I^q = \{a_i \le x_i \le b_i,\ 1 \le i \le q\}$ is called a computable rectangle if the $a_i$ and $b_i$ are computable reals.

Definition 4 Let $I^q \subset \mathbb{R}^q$ be a computable rectangle. A function $f : I^q \to \mathbb{R}$ is computable if:

(i) $f$ is sequentially computable, i.e. $f$ maps every computable sequence of points $x_k \in I^q$ into a computable sequence $\{f(x_k)\}$ of real numbers;

(ii) $f$ is effectively uniformly continuous, i.e. there is a recursive function $d : \mathbb{N} \to \mathbb{N}$ such that for all $x, y \in I^q$ and all $N$:

$|x - y| \le 1/d(N)$ implies $|f(x) - f(y)| \le 2^{-N}$.

Now functions defined on all of $\mathbb{R}^q$ are considered.

Definition 5 A sequence of functions $f_n : \mathbb{R}^q \to \mathbb{R}$ is computable if:

(i) for any computable sequence of points $x_k \in \mathbb{R}^q$, the double sequence of reals $\{f_n(x_k)\}$ is computable;

(ii) there exists a recursive function $d : \mathbb{N} \times \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ such that for all $M, n, N$:

$|x - y| \le 1/d(M, n, N)$ implies $|f_n(x) - f_n(y)| \le 2^{-N}$ for all $x, y \in I_M^q$,

where $I_M^q = \{-M \le x_i \le M,\ 1 \le i \le q\}$.

Next we state theorems about the computability of integrals of functions.

Theorem 1 Let $I^q$ be a computable rectangle in $\mathbb{R}^q$, and let $f_n : I^q \to \mathbb{R}$ be a computable sequence of functions. Then the definite integrals

$$v_n = \int \cdots \int_{I^q} f_n(x_1, \ldots, x_q)\, dx_1 \cdots dx_q$$

form a computable sequence of real numbers.
Theorem 2 Let $f$ be a computable function on a computable interval $[a, b]$. Then the indefinite integral

$$\int_a^x f(u)\, du$$

is computable on $[a, b]$.

Now we define recursively enumerable nonrecursive sets.

A set $A \subset \mathbb{N}$ is called recursively enumerable if $A = \emptyset$ or $A$ is the range of a recursive function $a$. In other words, we can compute $a(0), a(1), a(2), \ldots$ step by step using a Turing machine.

A set $A \subset \mathbb{N}$ is called recursive if both $A$ and its complement $\mathbb{N} - A$ are recursively enumerable.

A fundamental theorem of logic is the following.

Theorem 3 There exists a set $A \subset \mathbb{N}$ which is recursively enumerable but not recursive.

If a set $A$ is recursively enumerable but nonrecursive, then we have a recursive (or computable) procedure to generate the elements $a(0), a(1), a(2), \ldots$ sequentially, but we have no recursive (or computable) procedure to tell whether an arbitrary number $n \in \mathbb{N}$ belongs to $A$ or not. We do not know how long we must compute the sequence before $n$ appears. This is expressed in the following lemma.

Lemma 1 (Waiting lemma). Let $a : \mathbb{N} \to \mathbb{N}$ be a one-to-one recursive function generating a recursively enumerable nonrecursive set $A$. Let $w(n)$ denote the "waiting time"

$$w(n) = \max\{m : a(m) \le n\}.$$

Then there is no recursive function $c$ such that $w(n) \le c(n)$ for all $n$.
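The definition of the waiting time can be made concrete with a toy enumeration. A genuinely nonrecursive set cannot be written down as a finite table, so the list `TOY_A` below is purely a hypothetical stand-in (a finite one-to-one listing in which small values arrive late); it illustrates the definition of $w(n)$ only, not the lemma's conclusion.

```python
# Hypothetical finite stand-in for a(0), a(1), ...; one-to-one by construction.
TOY_A = [5, 3, 8, 1, 9, 2, 7, 4, 6, 10]

def waiting_time(n, enumeration=TOY_A):
    """w(n) = max{m : a(m) <= n}, or None if no listed value is <= n."""
    indices = [m for m, value in enumerate(enumeration) if value <= n]
    return max(indices) if indices else None

# Waiting times for n = 1, ..., 10 in the toy enumeration.
waiting_times = [waiting_time(n) for n in range(1, 11)]
```

For a finite table, $w(n)$ is found by scanning the whole table. For a real recursively enumerable nonrecursive set no such scan terminates with certainty, which is what the lemma expresses.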

One example of a recursively enumerable nonrecursive set is the set of halting programs.

The next theorem concerns the convergence of sequences of functions. The proof is in [18].

Theorem 4 (Closure under effective uniform convergence) Let $f_{nk} : I^q \to \mathbb{R}$ be a computable double sequence of functions such that $f_{nk} \to f_n$ as $k \to \infty$, uniformly in $x$, effectively in $k$ and $n$. Then $\{f_n\}$ is a computable sequence of functions.

Now we show an example of a function which is not bounded by any recursive function, which will be used later in Section 4.

Example Let $a : \mathbb{N} \to \mathbb{N}$ be a one-to-one recursive function generating a recursively enumerable nonrecursive set $A$. We assume $0 \notin A$. Then the function $f(z) = \sum_{m=0}^{\infty} z^m / a(m)^m$ is an entire function but is not bounded by any recursive function.

In this example, the sequence of Taylor coefficients $\{1/a(m)^m\}$ is computable, and the series converges uniformly on any compact disk $\{|z| \le M\}$, where $M$ is a positive integer. To see this we note that there are only finitely many values of $a(m)$ with $a(m) \le M$. All other $a(m)$ satisfy $a(m) \ge M + 1$, and inside the disk $\{|z| \le M\}$ the sum of the remaining terms, which contain only $a(m) \ge M + 1$, is bounded by $\sum_m M^m/(M + 1)^m$. So the series for $f$ converges uniformly on any compact disk $\{|z| \le M\}$.

But the sequence of values $f(0), f(1), f(2), \ldots$ is not bounded by any recursive function. For a positive real argument, $f(x)$ is larger than any single term in its Taylor series. Taking the one term $m = w(n)$, where $w(n)$ is the waiting time of the sequence $a(n)$, so that $a(m) \le n$, we have

$$f(2n) \ge \left(\frac{2n}{a(m)}\right)^m \ge \left(\frac{2n}{n}\right)^m = 2^m = 2^{w(n)} > w(n). \tag{13}$$

Hence $f(2n) > w(n)$, and since $w(n)$ is not bounded by any recursive function, $f(z)$ is not bounded by any recursive function either.
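The chain of inequalities in Eq. (13) can be checked on a toy enumeration. As before, `TOY_A` is a hypothetical finite stand-in for $a(m)$ (a real nonrecursive enumeration cannot be tabulated), and `f_truncated` sums only the listed terms, which is a lower bound on $f$ for positive arguments.

```python
from fractions import Fraction

# Hypothetical finite stand-in for a(0), a(1), ...; one-to-one, with 0 excluded.
TOY_A = [5, 3, 8, 1, 9, 2, 7, 4, 6, 10]

def waiting_time(n):
    """w(n) = max{m : a(m) <= n} for the toy table (defined for n >= 1)."""
    return max(m for m, value in enumerate(TOY_A) if value <= n)

def f_truncated(z):
    """Partial sum of f(z) = sum_m z^m / a(m)^m over the listed terms only."""
    return sum(Fraction(z) ** m / Fraction(a_m) ** m for m, a_m in enumerate(TOY_A))

# Pairs (f(2n), 2**w(n)) for n = 1, ..., 10; the first entry dominates the second.
bound_pairs = [(f_truncated(2 * n), 2 ** waiting_time(n)) for n in range(1, 11)]
```

Exact rational arithmetic is used so that the comparison is not affected by rounding.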

With these preliminaries, in the next section we construct an example in which the time evolution of entropy is not computable.



4 Computability of time evolution of entropy 



In this section we construct a Hamiltonian system, in which the Hamiltonian 
and its partial derivatives, initial probability distribution and information are 
computable under a computable partition but the time evolution of entropy 
under the original partition grows faster than any recursive function. 

To construct our Hamiltonian and probability distribution we first define a pulse function

$$\phi(x) = \begin{cases} e^{-x^2/(1-x^2)} & -1 < x < 1, \\ 0 & \text{otherwise}. \end{cases} \tag{14}$$

This function is in $C^\infty$ and has the support $[-1, 1]$ (Fig. 2). We define the normalization constant of $\phi(x)$ as $N_\phi = 1/\int \phi(x)\,dx = 0.828569\ldots$. $N_\phi$ is computable, since it is the reciprocal of an integral of a computable function over a bounded interval, which is computable by Theorem 1.



We define another pulse function

$$\psi(x) = \begin{cases} \phi_{int}(2^4(x + 5/16)) & \text{for } x < 0, \\ \phi_{int}(-2^4(x - 5/16)) & \text{for } 0 \le x, \end{cases} \tag{15}$$

where $\phi_{int}(x)$ is given by

$$\phi_{int}(x) = \begin{cases} N_\phi \int_{-1}^{x} \phi(u)\,du & \text{for } x \ge -1, \\ 0 & \text{for } x < -1. \end{cases} \tag{16}$$

$\phi_{int}(x)$ is 0 for $x < -1$, increases smoothly (in a $C^\infty$ way) from 0 to 1 for $-1 < x < 1$, and is 1 for $1 < x$. It is a computable function by Theorem 2. The shape of $\psi(x)$ is shown in Fig. 3. It is a $C^\infty$ function which increases from 0 to 1 in a $C^\infty$ way on the interval $(-3/8, -1/4)$, has the constant value 1 between $-1/4$ and $1/4$, and decreases from 1 to 0 in a $C^\infty$ way on the interval $(1/4, 3/8)$. Elsewhere it is 0. Both $\phi(x)$ and $\psi(x)$ are computable functions.
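The two pulse functions can be evaluated numerically. The sketch below assumes the pulse $\phi(x) = e^{-x^2/(1-x^2)}$ on $(-1, 1)$ and the lower integration limit $-1$ in $\phi_{int}$, which is what the stated properties of $\phi_{int}$ require. The composite Simpson rule and the step count are arbitrary choices for this illustration, not part of the paper.

```python
import math

def phi(x):
    """Smooth pulse: exp(-x^2 / (1 - x^2)) on (-1, 1), zero elsewhere."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-x * x / (1.0 - x * x))

def simpson(func, lo, hi, steps=2000):
    """Composite Simpson rule; steps must be even."""
    h = (hi - lo) / steps
    total = func(lo) + func(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3.0

# Normalization constant N_phi = 1 / integral of phi over [-1, 1].
N_PHI = 1.0 / simpson(phi, -1.0, 1.0)

def phi_int(x):
    """Normalized running integral of phi: 0 below -1, 1 above 1."""
    if x <= -1.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return N_PHI * simpson(phi, -1.0, x)

def psi(x):
    """Plateau pulse: rises on (-3/8, -1/4), equals 1 on [-1/4, 1/4], falls on (1/4, 3/8)."""
    if x < 0:
        return phi_int(16.0 * (x + 5.0 / 16.0))
    return phi_int(-16.0 * (x - 5.0 / 16.0))
```

Numerical quadrature only approximates the values; the computability statements in the text rest on Theorems 1 and 2, not on this particular rule.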

Using the pulse functions $\phi(x)$ and $\psi(x)$ above, we construct the Hamiltonian and the probability distribution function. We consider the 6-dimensional phase space

$$(p, q) = (p_1, p_2, p_3, q_1, q_2, q_3).$$

The Hamiltonian is constructed as

$$H = \sum_{m=0}^{\infty} H_m(p, q), \tag{17}$$

where $H_m$ is defined as

$$H_m = m\,(e^{p_2} p_1 q_1 - q_2)\,\psi(q_3 - m). \tag{18}$$

Since the support of each $H_m$ lies in $m - 3/8 < q_3 < m + 3/8$ (none of them overlap), for any computable point $(p, q)$ we can take a ball centered at that point, with a radius that shrinks as $N$ grows, such that for sufficiently large $N$ the ball meets the support of at most one $H_m$, which is a computable function. With this fact and Definition 5 we see that $H$ is a computable function on $\mathbb{R}^6$. Also, all $\partial H_m/\partial p_i$ and $\partial H_m/\partial q_i$ are computable, and we can choose a neighborhood of a computable point which contains at most one nonzero derivative of $H_m$. By the same logic all $\partial H/\partial p_i$ and $\partial H/\partial q_i$ are computable.
The initial probability distribution is chosen as

$$\rho(p, q) = N_\rho \sum_{n=0}^{\infty} \sum_{m=1}^{\infty} \rho_{mn}(p, q), \tag{19}$$

where

$$\rho_{mn} = \frac{N_\phi^6}{2^{6n+m} a(n)^n}\, \phi(2^{n+m}(p_1 - m))\, \phi(2^{n+1}(p_2 - 1/2))\, \phi(2^n p_3) \times \prod_{i=1}^{2} \phi(2^n q_i)\, \phi(2^{n+2}(q_3 - n)) \tag{20}$$

and $\{a(0), a(1), a(2), \ldots\}$ is a recursively enumerable nonrecursive set generated by a recursive function $a$ from $\mathbb{N}$ to $\mathbb{N} - \{0\}$. In Eq. (20), the support of each $\rho_{mn}(p, q)$ is a 6-dimensional hyperrectangle with three sides of length $2^{-n+1}$ (the $p_3$, $q_1$, $q_2$ sides), one $p_2$ side of length $2^{-n}$, one $q_3$ side of length $2^{-n-1}$ and one $p_1$ side of length $2^{-n-m+1}$, centered at $(p_1, p_2, p_3, q_1, q_2, q_3) = (m, 1/2, 0, 0, 0, n)$ (see Fig. 4). The probability inside each $\rho_{mn}$ is $\int dp\,dq\,\rho_{mn} = 2^{-12n-2m-3}/a(n)^n$.

Again, each $\rho_{mn}$ is computable and none of the supports of the $\rho_{mn}$ overlap. For any computable point, a ball centered at that point whose radius shrinks with $N$ meets, for large enough $N$, the support of at most one nonzero $\rho_{mn}$, which is a computable function. From Definition 5, $\rho(p, q)$ is computable. $N_\rho$ is the normalization constant which makes $\int \rho(p, q)\,dp\,dq = 1$. $N_\rho$ is also a computable number, since the sum $\sum_{m,n} \int dp\,dq\,\rho_{mn} = \sum_{m,n} 2^{-12n-2m-3}/a(n)^n$ converges faster than a geometric series.



Next we compute the information and entropy of the initial probability distribution. In Eq. (4), the entropy is defined through the probabilities with respect to a computable partition. Here a computable partition means that the boundaries of the partition are given by computable functions and that any computable finite region can be covered by increasing the number of cells in a recursive way. Let us choose the original partition as the surfaces $p_i = k 2^{-n_p}$ and $q_i = k 2^{-n_p}$ ($i = 1, 2, 3$; $n_p$ is a (possibly large) natural number; $k = 0, 1, 2, \ldots$). Then the smallest cell in this partition is a 6-dimensional hypercube with each side of length $2^{-n_p}$ and volume $\mu = 2^{-6n_p}$. For a given scale $\alpha$ which defines the unit volume, $-\log(\mu/\alpha)$ is the precision one can get for the volume and $-\log(2^{-n_p}/\alpha)$ is the precision for each coordinate. For now let us choose the scale $\alpha$ as 1.

Under this partition, for any natural number $n_p$, the information part satisfies $-\sum_i \rho_i \mu \log \rho_i < (6n + m)2^{-6n-m}$ for each $\rho_{mn}$ pulse. The total information part is dominated by $\sum_{m,n} (6n + m)2^{-6n-m}$, which is effectively convergent, so the initial information under this partition is computable. The information capacity part is $-\log\mu$ and is also computable.

Now we consider the time evolution of the probability distribution and of its entropy and information under the original partition. Since Hamiltonian time evolution is measure preserving, the term $-\sum_i \rho_i \mu \log \rho_i$ is conserved during the time evolution if we use a finer partition. But the information capacity term, which depends on the finer partition and on the ability to resolve the probability distribution with the original partition, increases with time, as the last term on the RHS of Eq. (11) shows.

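Hamilton's equations are

$$\dot p_i = -\frac{\partial H}{\partial q_i}, \qquad \dot q_i = \frac{\partial H}{\partial p_i}, \tag{21}$$

and the solution for the Hamiltonian in Eq. (17) is as follows (note that each $\rho_{mn}$ lies in the strip $n - 1/4 < q_3 < n + 1/4$ of its own $n$): for $n - 1/4 < q_3 < n + 1/4$,

$$\begin{aligned}
p_1(t) &= p_1(0)\exp(-e^{nt+p_2(0)} + e^{p_2(0)}), & q_1(t) &= q_1(0)\exp(e^{nt+p_2(0)} - e^{p_2(0)}), \\
p_2(t) &= nt + p_2(0), & q_2(t) &= e^{p_2(0)} p_1(0) q_1(0)(e^{nt} - 1) + q_2(0), \\
p_3(t) &= p_3(0), & q_3(t) &= q_3(0).
\end{aligned} \tag{22}$$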

This solution shows doubly exponential squeezing and stretching in the $p_1$ and $q_1$ directions. For example, a rectangle in $p_1 q_1$ space with side lengths $\delta p_1$ and $\delta q_1$ at $t = 0$ is deformed into a rectangle with side lengths $\delta p_1 \exp(-e^{nt+p_2(0)} + e^{p_2(0)})$ and $\delta q_1 \exp(e^{nt+p_2(0)} - e^{p_2(0)})$.
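The closed-form solution can be checked against Hamilton's equations directly. The sketch below restricts attention to the plateau $n - 1/4 < q_3 < n + 1/4$, where $\psi = 1$ and the Hamiltonian reduces to $n(e^{p_2} p_1 q_1 - q_2)$; the function names and the sample initial point are illustrative assumptions.

```python
import math

def flow(n, t, p1, q1, p2, q2):
    """Closed-form solution inside the plateau; returns (p1, q1, p2, q2) at time t."""
    stretch = math.exp(p2 + n * t) - math.exp(p2)
    p1_t = p1 * math.exp(-stretch)  # squeezed in the p1 direction
    q1_t = q1 * math.exp(stretch)   # stretched in the q1 direction
    p2_t = n * t + p2
    q2_t = math.exp(p2) * p1 * q1 * (math.exp(n * t) - 1.0) + q2
    return p1_t, q1_t, p2_t, q2_t

def vector_field(n, p1, q1, p2, q2):
    """Right-hand sides (dp1/dt, dq1/dt, dp2/dt, dq2/dt) for H = n (e^{p2} p1 q1 - q2)."""
    e = math.exp(p2)
    return (-n * e * p1, n * e * q1, float(n), n * e * p1 * q1)

# Sample point: compare a central-difference time derivative with the vector field.
state0 = (0.3, 0.2, 0.5, 0.1)
n, t, h = 1, 0.5, 1e-6
now = flow(n, t, *state0)
numeric = [
    (a - b) / (2 * h)
    for a, b in zip(flow(n, t + h, *state0), flow(n, t - h, *state0))
]
exact = vector_field(n, *now)
```

Because $p_1$ is multiplied by $e^{-s}$ and $q_1$ by $e^{s}$ with the same exponent $s$, the product $p_1 q_1$ is conserved, which is the area preservation in the $p_1 q_1$ plane.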

If the partition in $p_1 q_1$ space is made of $\delta p_1 \delta q_1$ cells, then under time evolution one cell is stretched and overlaps at least $N_s = [\exp(e^{nt+p_2(0)} - e^{p_2(0)})]$ other cells in the $q_1$ direction ($[x]$ denotes the largest integer not larger than $x$). In the $p_1$ direction, $N_s$ partial cells are squeezed into one cell, and at least $\log N_s$ bits of resolution are needed to distinguish the thin strips of cells inside the one original cell.
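The number of bits $\log N_s$ can be tabulated without ever forming $N_s$, which overflows floating point almost immediately. The sketch below works with the exponent $e^{nt+p_2(0)} - e^{p_2(0)}$ and ignores the floor in $N_s$; the function names and the sample values of $n$, $t$ and $p_2(0)$ are assumptions for illustration.

```python
import math

def stretch_exponent(n, t, p2):
    """Exponent s with N_s ~ exp(s): s = e^{n t + p2} - e^{p2}."""
    return math.exp(n * t + p2) - math.exp(p2)

def resolution_bits(n, t, p2):
    """Approximate log2(N_s) = s * log2(e), computed in the log domain."""
    return stretch_exponent(n, t, p2) * math.log2(math.e)

# Bits needed for n = 1, p2(0) = 1/2 at integer times t = 1, ..., 6.
bits_by_time = [resolution_bits(1, t, 0.5) for t in range(1, 7)]
```

The required resolution grows exponentially in $t$ for every fixed $n \ge 1$, and the growth rate itself increases with $n$.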



In view of the information capacity term $\sum_i \rho_i(0)\mu\big(-\sum_j a_{ji}(t)\log a_{ji}(t)\big)$, for the $\rho_i(0)$ which lie in $n - 1/4 < q_3 < n + 1/4$ each $a_{ji}(t)$ is around $1/N_s$ and the sum runs over $N_s$ terms. So

$$-\sum_j a_{ji}(t)\log a_{ji}(t) \approx \log N_s. \tag{23}$$
From Eq. (20), $0 < p_2(0) < 1$ for all $\rho_{mn}$ and

$$\exp(e^{nt}) < \exp(e^{nt+p_2(0)} - e^{p_2(0)}) < \exp(e^{2nt}) \tag{24}$$

for $t > 2$ and $n \ge 1$. If we consider the whole 6-dimensional space, the number of overlapping cells is larger owing to the exponential stretching in the $q_2$ direction. So for each $\rho_i\mu$ segment in $n - 1/4 < q_3 < n + 1/4$ the term $-\sum_j a_{ji}(t)\log a_{ji}(t)$ is bounded by

$$e^{nt}\log e < -\sum_j a_{ji}(t)\log a_{ji}(t) < e^{2nt}\log e \tag{25}$$

for $t > 2$. Considering all $\rho_{mn}$ pulses for fixed $n$, the information capacity increase in $n - 1/4 < q_3 < n + 1/4$ is bounded by

$$\begin{aligned}
\sum_{m=1}^{\infty}\frac{2^{-12n-2m-3}}{a(n)^n}\, e^{nt}\log e &= \frac{\log e}{24}\left(\frac{2^{-12}e^{t}}{a(n)}\right)^n \\
&< \sum_{\substack{\rho_i \text{ in} \\ n-1/4 < q_3 < n+1/4}} \rho_i(0)\mu\Big(-\sum_j a_{ji}(t)\log a_{ji}(t)\Big) < \frac{\log e}{24}\left(\frac{2^{-12}e^{2t}}{a(n)}\right)^n.
\end{aligned} \tag{26}$$

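By summing over $n$ for the total probability distribution, we get

$$\frac{\log e}{24}\sum_{n}\left(\frac{2^{-12}e^{t}}{a(n)}\right)^n < \sum_i \rho_i(0)\mu\Big(-\sum_j a_{ji}(t)\log a_{ji}(t)\Big) < \frac{\log e}{24}\sum_{n}\left(\frac{2^{-12}e^{2t}}{a(n)}\right)^n. \tag{27}$$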

As in the example at the end of Section 3, we see that the information capacity increases at times $t = 0, 1, 2, 3, \ldots$ are finite but not bounded by any recursive function, for any $\mu$. The information capacity is related to the ability to describe, to accuracy $\mu$, how far a cell is stretched or how many other cells are squeezed into an original cell for the probability pulses. But this grows faster than any recursive function, and we cannot find a recursive way to compute the information within the original computable partition.



5 Summary 



In summary, we defined information and information capacity in classical Hamiltonian systems and showed an example in which the initial probability distribution and its information are computable, and the Hamiltonian and its derivatives are computable, but the information capacity increase is not bounded by any recursive function. The total entropy, which is defined through a computable discrete partition, is initially computable, but its time evolution grows faster than any recursive function. This total entropy is related to the precision required to compute the information, so the time evolution of the information is not computable within the original computable discrete partition. Even though information is a conserved quantity in Hamiltonian time evolution, the result shows that we might not actually be able to compute it.
Fig. 1. The new discretized probability distribution $\rho_i(t)$. In the left figure, each square cell has discretized probability density $\rho_i(0)$ ($i = 1, \ldots, 4$). After one discrete time step the cells are deformed (shown as dashed parallelograms). The new discretized probability density $\rho_i(t)$ in the cell $C_i(0)$ (the square with the thick line in the right figure) is obtained by averaging the portions of the probability densities moved into the cell $C_i(0)$.
Fig. 2. Plot of $\phi(x)$. $\phi(x)$ is a $C^\infty$ function with support $[-1, 1]$.

Fig. 3. Plot of $\psi(x)$. $\psi(x)$ is a $C^\infty$ function with support $[-3/8, 3/8]$. $\psi(x)$ has constant value 1 for $-1/4 < x < 1/4$.
Fig. 4. The supports of $\rho_{mn}$ for fixed $n$ with $m = 1, 2, 3, \ldots$ in $p_1 q_1$ space.